filer: switch workspace upload from import-file to /workspace/import by shreyas-goenka · Pull Request #5165 · databricks/cli

shreyas-goenka · 2026-05-04T08:31:02Z

Summary

Replace POST /api/2.0/workspace-files/import-file/{path} with the multipart variant of POST /api/2.0/workspace/import (via the SDK's Workspace.Upload + format=AUTO). The previous endpoint is being deprecated and the new endpoint is the strategic replacement.

Why

The import-file file API is being deprecated.
In practice /worskapce/import is 2x faster in most cases.
The import API is well supported and returns better errors.

Why multipart

/workspace/import's JSON body caps at 10 MiB (databricks.webapp.autoExportFormatLimitBytes). The multipart variant accepts the same sizes import-file did — verified up to 450 MB for regular files against a real workspace.

Throughput (`aws-prod-ucws`, 10 workers, 4 KiB files)

Endpoint	Files	Wall	files/s	Speedup
new	10	1.01 s	9.94	0.86× (legacy faster)
legacy	10	0.86 s	11.58	—
new	20	1.37 s	14.61	2.22×
legacy	20	3.04 s	6.58	—
new	50	2.72 s	18.40	1.74×
legacy	50	4.73 s	10.57	—
new	100	6.12 s	16.34	2.34×
legacy	100	14.31 s	6.99	—
new	300	19.39 s	15.48	1.81×
legacy	300	35.16 s	8.53	—
new	500	34.69 s	14.41	2.27×
legacy	500	1 m 18.51 s	6.37	—

Crossover at ~15 files. New endpoint sustains ~15 files/s; legacy degrades from 17 → 6 files/s as count grows.

Size caps (verified)

	Files (no header)	Notebooks (`# Databricks notebook source`)
`/workspace/import` (this PR, multipart)	OK at 450 MB	hard 10 MiB cap
`/workspace-files/import-file` (previous)	400 MB; 450 MB → 305 s server timeout	hard 10 MiB cap (same `databricks.notebook.maxNotebookSizeBytes`)

The 10 MiB notebook cap is server-side and applies to both endpoints — switching from /workspace-files/import-file to /workspace/import does not regress maximum notebook upload size.

Error mapping

errors.Is against SDK sentinels rather than raw status/error_code dispatch. All "path is occupied" surfaces map to fileAlreadyExistsError:

400 RESOURCE_ALREADY_EXISTS — sequential conflict, no overwrite flag.
409 ALREADY_EXISTS — concurrent contention (observed in TestLock).
400 INVALID_PARAMETER_VALUE (with "type mismatch" / "node type" message) — overwrite=true on a path where the existing object's node type differs from the upload (e.g. NOTEBOOK at /foo vs incoming regular content).

Plus ErrNotFound → noSuchDirectoryError (or mkdir+retry under CreateParentDirectories); ErrPermissionDenied → permissionError.

Test plan

Unit tests cover all three already-exists shapes + 403 / 500.
libs/testserver extended with multipart /workspace/import, notebook header detection for .py/.sql/.scala/.r, and /workspace/list.
acceptance/bundle/sync-upload-edge-cases exercises 12 MiB binary, empty file, 3 notebook languages (python/sql/scala), plain .py (no header), .lvdash.json dashboard, non-ASCII filename, spaced filename — asserts via workspace list that each lands with the right object_type + language and via jq that recorded multipart bodies pin paths/contents.
~70 acceptance fixtures regenerated.
Integration test TestFilerWorkspaceNotebook assertion updated to assert path with extension (tc.name); same change as filer: detect notebook already-exists across both error formats #5106.
All cross-type collision shapes (3) verified by direct probe against a real workspace.
Load tested vs the previous endpoint at concurrencies 1/10/20/30 (1 KB) and at sizes up to 450 MB.
CI green on this PR.

This pull request and its description were written by Isaac.

shreyas-goenka · 2026-05-08T06:49:58Z

+	//    Example: "Node with name /Users/me/.bundle/.../deploy.lock already
+	//    exists. Please pass overwrite=true to update it."
+	//
+	//  - 400 INVALID_PARAMETER_VALUE — overwrite=true on a path where the


This error is not good. This then requires us to do regex on the error to figure out whether it's "already exists". I'm asking the API team to change this.

shreyas-goenka · 2026-05-08T06:51:26Z

Benchmark results on the new API vs old API, the new API is 2x better when a lot of files are being uploaded:

  ┌──────────┬───────┬─────────────┬─────────┬───────────────────────┐
  │ Endpoint │ Files │    Wall     │ files/s │   Speedup vs legacy   │
  ├──────────┼───────┼─────────────┼─────────┼───────────────────────┤
  │ new      │    10 │      1.19 s │    8.37 │ 0.50× (legacy faster) │
  ├──────────┼───────┼─────────────┼─────────┼───────────────────────┤
  │ legacy   │    10 │      0.61 s │   16.45 │                     — │
  ├──────────┼───────┼─────────────┼─────────┼───────────────────────┤
  │ new      │   100 │      6.04 s │   16.56 │                  3.9× │
  ├──────────┼───────┼─────────────┼─────────┼───────────────────────┤
  │ legacy   │   100 │     23.55 s │    4.25 │                     — │
  ├──────────┼───────┼─────────────┼─────────┼───────────────────────┤
  │ new      │   300 │     19.36 s │   15.49 │                  3.2× │
  ├──────────┼───────┼─────────────┼─────────┼───────────────────────┤
  │ legacy   │   300 │ 1 m 01.62 s │    4.87 │                     — │
  ├──────────┼───────┼─────────────┼─────────┼───────────────────────┤
  │ new      │   500 │     32.74 s │   15.27 │                  3.0× │
  ├──────────┼───────┼─────────────┼─────────┼───────────────────────┤
  │ legacy   │   500 │ 1 m 38.62 s │    5.07 │                     — │
  └──────────┴───────┴─────────────┴─────────┴───────────────────────┘

  (10 workers, 4 KiB files, aws-prod-ucws.)

  Findings:
  - Crossover at ~50 files: legacy beats new for ≤ 10 files; new is 3-4× faster from 100+ on.
  - Legacy throughput collapses: 17 → 4-5 files/s as N grows, server applies escalating back-pressure.
  - New throughput is flat at ~15 files/s regardless of count — no degradation up through 500 files.
  - For typical bundle deploys (~100-1000 files), expect a 3-4× wall-clock speedup.

Replace `POST /api/2.0/workspace-files/import-file/{path}?overwrite=…` with the multipart variant of `POST /api/2.0/workspace/import` (via the SDK's `Workspace.Upload` + `format=AUTO`). The legacy endpoint is being deprecated and the new endpoint is the strategic replacement. The multipart variant is required because the JSON body of /workspace/import is server-capped at 10 MiB; multipart accepts the same sizes import-file did (verified up to 250 MiB against a real workspace), so DAB users shipping wheels/jars/large files keep working. Error mapping uses SDK sentinels via errors.Is rather than raw status/error code comparisons: - ErrNotFound → noSuchDirectoryError (or mkdir+retry under CreateParentDirectories mode); also catches RESOURCE_DOES_NOT_EXIST. - ErrResourceAlreadyExists → fileAlreadyExistsError (the new endpoint reliably sets error_code RESOURCE_ALREADY_EXISTS). - ErrInvalidParameterValue + "Requested node type" message → fileAlreadyExistsError (existing object's node type doesn't match the upload — file vs notebook collision). - ErrPermissionDenied → permissionError. Apply the same sentinel-based pattern to Delete, ReadDir, and Stat for 404 detection, matching the existing usage in bundle/direct/util.go and following AGENTS.md's rule against branching on err.Error() string content. DIRECTORY_NOT_EMPTY in Delete keeps an explicit ErrorCode check since the SDK has no sentinel for it. Test plan: - libs/filer/workspace_files_client_test.go covers the new error mapping. - libs/testserver/handlers.go extended with a multipart handler at /workspace/import that surfaces 409s from the shared fake as 400 + RESOURCE_ALREADY_EXISTS to match real-workspace shape. - acceptance/internal/prepare_server.go normalizes multipart bodies (sorted form fields, file parts recorded as {filename, size}) so request fixtures stay deterministic. - ~70 acceptance fixtures regenerated for the new request shape. - End-to-end verified against a real workspace for files, all notebook types (.py / .sql / .ipynb / .lvdash.json / .scala / .r), dashboards, and a 60 MB binary upload. - Integration test TestFilerWorkspaceNotebook assertion updated to assert the path with extension (tc.name) — matches the absPath returned in fileAlreadyExistsError. Same change as #5106. Co-authored-by: Isaac

…ted goldens Add a focused acceptance test (acceptance/bundle/sync/upload-edge-cases) that exercises the multipart upload pipeline with the inputs that differ in shape between the legacy import-file and the new /workspace/import endpoints: - A 12 MiB binary file (above the JSON-body 10 MiB cap that the multipart variant lifts). - An empty file (multipart encodes an empty content part distinct from JSON's empty string). - A python notebook (auto-detected as NOTEBOOK; testserver mirrors extension stripping). - A .lvdash.json dashboard descriptor (real workspace assigns DASHBOARD; testserver records the upload-side request shape). - A non-ASCII filename (héllo.txt — multipart encodes filenames with quoted string rules). - A filename with a space. Assertions inspect out.requests.txt and pin: the set of multipart_form.path values, that every upload sets format=AUTO. Also restore 11 golden out.requests.txt files that the previous regenerate sweep accidentally deleted (templates/telemetry/*, run/inline-script/**, run/scripts/**, resources/volumes/change-schema-name) — they were present on main and silently disappeared during the rebase that brought their goldens together with the removal patch. Co-authored-by: Isaac

… writes) The /workspace/import endpoint reports an existing path with two different error codes depending on contention: - 400 RESOURCE_ALREADY_EXISTS: sequential conflict (overwrite=false on existing path) Message: "<path> already exists. Please pass overwrite=true to overwrite it." - 409 ALREADY_EXISTS: concurrent contention (e.g. multiple deploy locks racing) Message: "Node with name <path> already exists. Please pass overwrite=true to update it." Verified by direct probe against aws-prod and aws-prod-ucws — sequential conflicts produce the first shape, but a 5-goroutine race on the same path produces the second shape with status 409 and error_code ALREADY_EXISTS. The original PR's errors.Is(err, ErrResourceAlreadyExists) only catches the first shape, leaving locker.Lock() returning the raw API error instead of falling through to assertLockHeld() — which is what TestLock asserts. Add errors.Is(err, ErrAlreadyExists) for the second shape; both inherit from apierr.ErrResourceConflict but we use the specific sentinels rather than the broader parent (ErrResourceConflict also covers ErrAborted, which is a different concept). Co-authored-by: Isaac

The os.Getenv check was added in this PR's first commit but is never set in any CI config or test environment. The pre-existing !WorkspaceTmpDir guard already covers the actual problem (yamlfmt build fails on DBR workspace filesystems). The env var was an unused escape hatch. Co-authored-by: Isaac

…ontent]" placeholder) Multipart parts in recorded fixtures used to be summarized as the literal string "[content]" regardless of what was uploaded. Reviewers couldn't see the actual deploy.lock / deployment.json / app.yml payloads, only that *something* was sent. Show the literal content for valid UTF-8 parts; existing global UNIX_TIME / UUID / USERNAME / DEV_VERSION / TIMESTAMP replacements normalize the dynamic fields so deploy.lock and similar payloads still produce stable diffs. Binary content and parts above 4 KiB still collapse to "[binary content N bytes]" so a 12 MiB wheel upload doesn't bloat the recording. Update the upload-edge-cases acceptance test to assert on the recorded content for each small text input (notebook source, lakeview dashboard JSON, "hello, world", "hello, naïve world", empty file) and confirm the 12 MiB binary collapses. Regenerated affected goldens under acceptance/bundle/{user_agent,validate, artifacts,libraries,apps}/. Co-authored-by: Isaac

Both the JSON-body and multipart variants of /api/2.0/workspace/import are capped — they share the same /workspace/import path but enforce different caps. Per Confluence "Workspace Import Export Endpoints": - JSON body, AUTO format: 10 MiB (databricks.webapp.autoExportFormatLimitBytes) - Multipart: 200 MiB (databricks.workspaceFilesystem.maxImportSizeBytes) - Legacy import-file: 200 MiB (same Armeria server limit) So multipart is the right replacement for the legacy endpoint not because it "handles arbitrary size", but because its 200 MiB cap matches the legacy endpoint we are migrating away from. The previous comment was wrong on both counts (no arbitrary size, and the 10 MiB cap is the JSON-body variant of /workspace/import, not the legacy /workspace-files/import-file endpoint). Co-authored-by: Isaac

Investigation against a real workspace (aws-prod-ucws) and Confluence page 4867063896 ("Workspace Import Export Endpoints"): - The 200 MiB cap (databricks.workspaceFilesystem.maxImportSizeBytes) gates the legacy /workspace-files/import-file endpoint (Projects service / Armeria), not the new /workspace/import multipart variant (Webapp). The multipart endpoint accepts at least 450 MB of regular-file content empirically. - The legacy endpoint additionally enforces a 305s server-side request timeout, which becomes the binding constraint at typical client bandwidth for files in the ~400-500 MB range. There is no client-side knob to extend it (databricks-sdk-go's HTTPTimeout is an inactivity timer that resets on every body read; http.Client.Timeout is left at 0). - Notebook content (payloads with the `# Databricks notebook source` header, detected via format=AUTO) hits a separate 10 MiB cap on the server (databricks.notebook.maxNotebookSizeBytes) — enforced on both endpoints, so the migration is not a regression on that axis. Co-authored-by: Isaac

A future reader of this comment can't tell what "this migration" or "the endpoint we are migrating away from" refers to once the PR commit is in the past. Spell out /workspace-files/import-file → /workspace/import where it matters, and re-anchor the no-regression note on those endpoint names rather than "this migration". Co-authored-by: Isaac

…xisting sentinels) Verified against a real workspace by uploading varied content into pre-existing paths of mismatched node types: every cross-type collision the server can produce surfaces as either: - 400 RESOURCE_ALREADY_EXISTS, or - 409 ALREADY_EXISTS both already caught by errors.Is(err, ErrResourceAlreadyExists) / errors.Is(err, ErrAlreadyExists). The "400 INVALID_PARAMETER_VALUE with 'Requested node type [X] is different from the existing node type [Y]'" shape was hypothesized in the original PR but does not occur in practice on /workspace/import — the branch was dead code. Replace the corresponding unit test with one that pins the actual cross-type collision shape (409 ALREADY_EXISTS with "Node with name X already exists"). Co-authored-by: Isaac

…rwrite=true) Previous commit (687c731) called the INVALID_PARAMETER_VALUE branch dead code based on conflict scenarios that didn't pass overwrite=true. Re-probing with overwrite=true (the mode bundle deploy uses) shows the server does return 400 INVALID_PARAMETER_VALUE for cross-type collisions, in two distinct message shapes: "Cannot overwrite the asset at X due to type mismatch (asked: ..., actual: ...)" "Requested node type [...] is different from the existing node type [...]" Both fire only when the existing object's node type differs from the upload (NOTEBOOK at /a/foo, regular-content overwrite-upload to /a/foo or /a/foo.py). Without overwrite=true, the same scenario surfaces as RESOURCE_ALREADY_EXISTS or ALREADY_EXISTS — which is why the no-overwrite probe missed this branch. Restore the mapping (returning fileAlreadyExistsError so the caller's errors.Is(err, fs.ErrExist) check still fires) and broaden the message pattern from "Requested node type" to also match "type mismatch", so both real-workspace messages are caught. Add unit tests for both shapes. Co-authored-by: Isaac

…amples Drop the abstract "Sources for the third bullet" footer and inline a concrete example message for each of the three error shapes. Each bullet now shows exactly what the server returns and the scenario that triggers it, so a future reader doesn't have to re-derive which path/state combination produces each shape. Co-authored-by: Isaac

The previous regen sweep widened the filter from '^//import-file/' '^//workspace/delete' (two narrow exclusions) to '^//workspace/' (catch-all under /workspace/*). That hid the workspace/mkdirs calls the test was previously asserting on. Restore the original intent: only exclude the upload + delete requests, preserving the mkdirs assertions. Also pick up the out.test.toml regeneration for the force_pull_commands test that landed on main (5028) to switch the [Section] format to dotted keys. Co-authored-by: Isaac

…l + scala notebooks Extend the testserver to detect notebook headers for .sql, .scala, .r in addition to .py (each with its language-appropriate line-comment) and to serve GET /api/2.0/workspace/list — so acceptance tests can assert what the server actually stored, not just what was sent on the wire. Extend acceptance/bundle/sync-upload-edge-cases to upload pyNb.py, sqlNb.sql, scalaNb.scala (all with the source header), plus plain-script.py (no header), and assert via "databricks workspace list" that: - pyNb → NOTEBOOK / PYTHON (extension stripped) - sqlNb → NOTEBOOK / SQL (extension stripped) - scalaNb → NOTEBOOK / SCALA (extension stripped) - plain-script.py → FILE (header missing, extension preserved) - dashboard.lvdash.json, large.bin, empty.txt, héllo.txt, with spaces.txt all → FILE Co-authored-by: Isaac

…dge-cases The fact that the test produces deterministic output already proves the 12 MiB binary isn't recorded byte-for-byte (otherwise the recording would contain literal megabytes of payload). The dedicated "12 MiB binary upload is summarized" assertion was implementation-detail noise. Co-authored-by: Isaac

github-actions · 2026-05-08T08:07:23Z

Approval status: pending

`/acceptance/bundle/` - needs approval

72 files changed
Suggested: @denik
Also eligible: @andrewnester, @pietern, @janniklasrose, @anton-107, @lennartkats-db

`/libs/filer/` - needs approval

Files: libs/filer/workspace_files_client.go, libs/filer/workspace_files_client_test.go
Suggested: @simonfaltum
Also eligible: @renaudhartert-db, @hectorcast-db, @parthban-db, @tanmay-db, @Divyansh-db, @tejaskochar-db, @mihaimitrea-db, @chrisst, @rauchy

General files (require maintainer)

Files: acceptance/internal/prepare_server.go, libs/testserver/fake_workspace.go, libs/testserver/handlers.go
Based on git history:

@denik -- recent work in libs/testserver/, libs/filer/, acceptance/internal/

_{Any maintainer (@andrewnester, @anton-107, @denik, @pietern, @simonfaltum, @renaudhartert-db) can approve all areas.

See OWNERS for ownership rules.}

shreyas-goenka temporarily deployed to test-trigger-is May 4, 2026 08:31 — with GitHub Actions Inactive

shreyas-goenka force-pushed the shreyas-goenka/import-api branch from 0a9923f to e86aeba Compare May 6, 2026 10:38

shreyas-goenka temporarily deployed to test-trigger-is May 6, 2026 10:38 — with GitHub Actions Inactive

shreyas-goenka temporarily deployed to test-trigger-is May 6, 2026 11:11 — with GitHub Actions Inactive

shreyas-goenka force-pushed the shreyas-goenka/import-api branch from 4db3fb6 to 6907378 Compare May 6, 2026 11:26

shreyas-goenka temporarily deployed to test-trigger-is May 6, 2026 11:27 — with GitHub Actions Inactive

shreyas-goenka temporarily deployed to test-trigger-is May 6, 2026 12:55 — with GitHub Actions Inactive

shreyas-goenka force-pushed the shreyas-goenka/import-api branch 2 times, most recently from 37a1400 to 1e2fe83 Compare May 6, 2026 14:30

shreyas-goenka temporarily deployed to test-trigger-is May 6, 2026 14:31 — with GitHub Actions Inactive

shreyas-goenka temporarily deployed to test-trigger-is May 6, 2026 15:45 — with GitHub Actions Inactive

shreyas-goenka temporarily deployed to test-trigger-is May 7, 2026 10:00 — with GitHub Actions Inactive

shreyas-goenka temporarily deployed to test-trigger-is May 8, 2026 06:17 — with GitHub Actions Inactive

shreyas-goenka temporarily deployed to test-trigger-is May 8, 2026 06:22 — with GitHub Actions Inactive

shreyas-goenka temporarily deployed to test-trigger-is May 8, 2026 06:27 — with GitHub Actions Inactive

shreyas-goenka temporarily deployed to test-trigger-is May 8, 2026 06:36 — with GitHub Actions Inactive

shreyas-goenka commented May 8, 2026

View reviewed changes

shreyas-goenka added 12 commits May 8, 2026 08:56

shreyas-goenka force-pushed the shreyas-goenka/import-api branch from 5dd5be7 to d4e962c Compare May 8, 2026 07:11

shreyas-goenka temporarily deployed to test-trigger-is May 8, 2026 07:12 — with GitHub Actions Inactive

shreyas-goenka temporarily deployed to test-trigger-is May 8, 2026 07:37 — with GitHub Actions Inactive

shreyas-goenka temporarily deployed to test-trigger-is May 8, 2026 08:05 — with GitHub Actions Inactive

shreyas-goenka marked this pull request as ready for review May 8, 2026 08:06

shreyas-goenka requested review from andrewnester, denik and pietern May 8, 2026 08:06

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

filer: switch workspace upload from import-file to /workspace/import#5165

filer: switch workspace upload from import-file to /workspace/import#5165
shreyas-goenka wants to merge 14 commits intomainfrom
shreyas-goenka/import-api

shreyas-goenka commented May 4, 2026 •

edited

Loading

Uh oh!

shreyas-goenka May 8, 2026

Uh oh!

shreyas-goenka commented May 8, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented May 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

shreyas-goenka commented May 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Why

Why multipart

Throughput (aws-prod-ucws, 10 workers, 4 KiB files)

Size caps (verified)

Error mapping

Test plan

Uh oh!

shreyas-goenka May 8, 2026

Choose a reason for hiding this comment

Uh oh!

shreyas-goenka commented May 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented May 8, 2026

Approval status: pending

/acceptance/bundle/ - needs approval

/libs/filer/ - needs approval

General files (require maintainer)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

shreyas-goenka commented May 4, 2026 •

edited

Loading

Throughput (`aws-prod-ucws`, 10 workers, 4 KiB files)

shreyas-goenka commented May 8, 2026 •

edited

Loading

`/acceptance/bundle/` - needs approval

`/libs/filer/` - needs approval